149 research outputs found

    A Deep Sequential Model for Discourse Parsing on Multi-Party Dialogues

    Full text link
    Discourse structures are beneficial for various NLP tasks such as dialogue understanding, question answering, sentiment analysis, and so on. This paper presents a deep sequential model for parsing discourse dependency structures of multi-party dialogues. The proposed model aims to construct a discourse dependency tree by predicting dependency relations and constructing the discourse structure jointly and alternately. It makes a sequential scan of the Elementary Discourse Units (EDUs) in a dialogue. For each EDU, the model decides to which previous EDU the current one should link and what the corresponding relation type is. The predicted link and relation type are then used to build the discourse structure incrementally with a structured encoder. During link prediction and relation classification, the model utilizes not only local information that represents the concerned EDUs, but also global information that encodes the EDU sequence and the discourse structure that is already built at the current step. Experiments show that the proposed model outperforms all the state-of-the-art baselines.Comment: Accepted to AAAI 201

    Robustness to Modification with Shared Words in Paraphrase Identification

    Full text link
    Revealing the robustness issues of natural language processing models and improving their robustness is important to their performance under difficult situations. In this paper, we study the robustness of paraphrase identification models from a new perspective -- via modification with shared words, and we show that the models have significant robustness issues when facing such modifications. To modify an example consisting of a sentence pair, we either replace some words shared by both sentences or introduce new shared words. We aim to construct a valid new example such that a target model makes a wrong prediction. To find a modification solution, we use beam search constrained by heuristic rules, and we leverage a BERT masked language model for generating substitution words compatible with the context. Experiments show that the performance of the target models has a dramatic drop on the modified examples, thereby revealing the robustness issue. We also show that adversarial training can mitigate this issue.Comment: Findings of EMNLP 202

    Robustly Leveraging Prior Knowledge in Text Classification

    Full text link
    Prior knowledge has been shown very useful to address many natural language processing tasks. Many approaches have been proposed to formalise a variety of knowledge, however, whether the proposed approach is robust or sensitive to the knowledge supplied to the model has rarely been discussed. In this paper, we propose three regularization terms on top of generalized expectation criteria, and conduct extensive experiments to justify the robustness of the proposed methods. Experimental results demonstrate that our proposed methods obtain remarkable improvements and are much more robust than baselines

    From One Point to A Manifold: Knowledge Graph Embedding For Precise Link Prediction

    Full text link
    Knowledge graph embedding aims at offering a numerical knowledge representation paradigm by transforming the entities and relations into continuous vector space. However, existing methods could not characterize the knowledge graph in a fine degree to make a precise prediction. There are two reasons: being an ill-posed algebraic system and applying an overstrict geometric form. As precise prediction is critical, we propose an manifold-based embedding principle (\textbf{ManifoldE}) which could be treated as a well-posed algebraic system that expands the position of golden triples from one point in current models to a manifold in ours. Extensive experiments show that the proposed models achieve substantial improvements against the state-of-the-art baselines especially for the precise prediction task, and yet maintain high efficiency.Comment: arXiv admin note: text overlap with arXiv:1509.0548

    An Interpretable Reasoning Network for Multi-Relation Question Answering

    Full text link
    Multi-relation Question Answering is a challenging task, due to the requirement of elaborated analysis on questions and reasoning over multiple fact triples in knowledge base. In this paper, we present a novel model called Interpretable Reasoning Network that employs an interpretable, hop-by-hop reasoning process for question answering. The model dynamically decides which part of an input question should be analyzed at each hop; predicts a relation that corresponds to the current parsed results; utilizes the predicted relation to update the question representation and the state of the reasoning process; and then drives the next-hop reasoning. Experiments show that our model yields state-of-the-art results on two datasets. More interestingly, the model can offer traceable and observable intermediate predictions for reasoning analysis and failure diagnosis, thereby allowing manual manipulation in predicting the final answer.Comment: COLING 2018, 13page

    SSP: Semantic Space Projection for Knowledge Graph Embedding with Text Descriptions

    Full text link
    Knowledge representation is an important, long-history topic in AI, and there have been a large amount of work for knowledge graph embedding which projects symbolic entities and relations into low-dimensional, real-valued vector space. However, most embedding methods merely concentrate on data fitting and ignore the explicit semantic expression, leading to uninterpretable representations. Thus, traditional embedding methods have limited potentials for many applications such as question answering, and entity classification. To this end, this paper proposes a semantic representation method for knowledge graph \textbf{(KSR)}, which imposes a two-level hierarchical generative process that globally extracts many aspects and then locally assigns a specific category in each aspect for every triple. Since both aspects and categories are semantics-relevant, the collection of categories in each aspect is treated as the semantic representation of this triple. Extensive experiments justify our model outperforms other state-of-the-art baselines substantially.Comment: Submitted to AAAI.201

    Story Ending Generation with Incremental Encoding and Commonsense Knowledge

    Full text link
    Generating a reasonable ending for a given story context, i.e., story ending generation, is a strong indication of story comprehension. This task requires not only to understand the context clues which play an important role in planning the plot but also to handle implicit knowledge to make a reasonable, coherent story. In this paper, we devise a novel model for story ending generation. The model adopts an incremental encoding scheme to represent context clues which are spanning in the story context. In addition, commonsense knowledge is applied through multi-source attention to facilitate story comprehension, and thus to help generate coherent and reasonable endings. Through building context clues and using implicit knowledge, the model is able to produce reasonable story endings. context clues implied in the post and make the inference based on it. Automatic and manual evaluation shows that our model can generate more reasonable story endings than state-of-the-art baselines.Comment: Accepted in AAAI201

    Modeling Rich Contexts for Sentiment Classification with LSTM

    Full text link
    Sentiment analysis on social media data such as tweets and weibo has become a very important and challenging task. Due to the intrinsic properties of such data, tweets are short, noisy, and of divergent topics, and sentiment classification on these data requires to modeling various contexts such as the retweet/reply history of a tweet, and the social context about authors and relationships. While few prior study has approached the issue of modeling contexts in tweet, this paper proposes to use a hierarchical LSTM to model rich contexts in tweet, particularly long-range context. Experimental results show that contexts can help us to perform sentiment classification remarkably better

    UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation

    Full text link
    Despite the success of existing referenced metrics (e.g., BLEU and MoverScore), they correlate poorly with human judgments for open-ended text generation including story or dialog generation because of the notorious one-to-many issue: there are many plausible outputs for the same input, which may differ substantially in literal or semantics from the limited number of given references. To alleviate this issue, we propose UNION, a learnable unreferenced metric for evaluating open-ended story generation, which measures the quality of a generated story without any reference. Built on top of BERT, UNION is trained to distinguish human-written stories from negative samples and recover the perturbation in negative stories. We propose an approach of constructing negative samples by mimicking the errors commonly observed in existing NLG models, including repeated plots, conflicting logic, and long-range incoherence. Experiments on two story datasets demonstrate that UNION is a reliable measure for evaluating the quality of generated stories, which correlates better with human judgments and is more generalizable than existing state-of-the-art metrics.Comment: Long paper; Accepted by EMNLP202

    TransA: An Adaptive Approach for Knowledge Graph Embedding

    Full text link
    Knowledge representation is a major topic in AI, and many studies attempt to represent entities and relations of knowledge base in a continuous vector space. Among these attempts, translation-based methods build entity and relation vectors by minimizing the translation loss from a head entity to a tail one. In spite of the success of these methods, translation-based methods also suffer from the oversimplified loss metric, and are not competitive enough to model various and complex entities/relations in knowledge bases. To address this issue, we propose \textbf{TransA}, an adaptive metric approach for embedding, utilizing the metric learning ideas to provide a more flexible embedding method. Experiments are conducted on the benchmark datasets and our proposed method makes significant and consistent improvements over the state-of-the-art baselines
    • …
    corecore